I. Introduction

For the past few months I have been studying the problem
of how to make a computer understand linguistic information (in
some generally accepted sense of "understand"). I have listened
to courses on Linguistic Structure (Dr. Chomsky) and Mechanical
Translation (Dr. Yngve), and read the works of various linguists,
logicians, psychologists, and computer programmers on subjects
ranging through semantics, information retrieval, and mechanical
translation.

The remainder of this paper is divided into two parts:
a survey of various ideas and results appearing in the current
literature (with some editorial comments), and a proposal for future
work to include a computer system for storing and extracting semantic
information.

II. Background Literature
A. General Semantics

"Semantics" is generally studied from one of two viewpoints:
pure or descriptive. Pure semantics, the kind studied by Carnap,
deals with the properties of artificially-constructed formal systems
(which may or may not have analogues in the real world), with respect
to rules for sentence formation and designation of formal models and
truth values. We shall rather be concerned with descriptive semantics,
an empirical search for rules governing truth and meaningfulness of
sentences in natural language.

We quickly encounter the paradox of having to use words
with which to discuss the meaning of words, in fact even of the
word "meaning." Any attempt to distinguish between object language
and meta-language seems strained and artificial. A common device
is to define "meaning" with a very specialized sense, or to deny that
it can be defined at all. Quine, tongue in cheek, recognizes
this difficulty in the following paragraph:

"One must remember that an expression's meaning (if we are
to admit such things as meanings) is not to be confused with the
object, if any, that the expression designates. Sentences do not
designate at all ... though words in them may; sentences are simply
not singular terms. But sentences still have meanings (if we admit
such things as meanings); and the meaning of an eternal sentence
is the object designated by the singular term found by bracketing the
sentence. That singular term will have a meaning in turn (if we
are prodigal enough with meanings), but it will presumably be
something further. Under this approach the meaning (if such there
be) of the non-eternal sentence 'The door is open' is not a
proposition...", implying that the illusive meaning of "The door
is open" is some complete intuitive set of circumstances surrounding
a particular occasion on which the statement "The door is open" was
uttered. Clearly this kind of concept does not lend itself to
computer usage.
Ziff is more precise in making the following distinction:
words may have meaning (but not significance); utterances (phrases,
sentences) may have significance (but not meaning). However he
states that an analysis of the significance of a whole utterance
cannot be complete without an analysis of the meanings of the words
in the utterance.

Another similar approach is that of Ullmann, who considers a
word as the smallest significant unit with isolated "content,"
whereas phrases, sentences, etc., express relations between the
things which are symbolized by individual words. Here "meaning" is
defined as "a reciprocal relationship between the name and the sense,
which enables the one to call up the other." By "sense" is meant
the thought or reference to an object or association (referent) which
is represented by the word (symbol). Note that meaning here relates
word with thought about object, not necessarily with object itself.

In this connection let us mention Walpole's "25 definition
routes." He observes that a word may be defined by direct
symbolization, i.e. associating it with an observable object, but also
any other kind of association, connection, or characteristic, such
as location or [illegible] or legal relation, can be used; and the
precise connection is generally described verbally. Walpole also notes
(and we shall use this fact later) that some word relationships, such
as whole to part, or generalization to special case, determine partial
orderings of large classes of words (especially common nouns) into
tree structures. However the class of abstract nouns or "fictions,"
which do not name any object in any specific sense-experience, do
not lend themselves to such ordering.

B. Grammar and Meaning

Thus far we have discussed meaning while ignoring the
grammar (syntax) of language; but clearly grammar must be
considered since an ungrammatical sentence is not likely to be
very meaningful.

A "grammar" is usually defined as a set of rules for
generating the grammatical sentences of a language, and none of
the ungrammatical ones. Deriving a grammar is an empirical business,
since the ultimate test of whether a statement is grammatical or
not is to ask a native speaker. Considering only the functional
roles of words in sentences (their "parts-of-speech"), but not
their meanings in any sense, Chomsky develops various models for
English grammar: phrase structure is a simple concept and works for
a small part of the language, but is generally inadequate;
transformational schemes are probably adequate but are complicated
and difficult to complete or test.
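
As a minimal sketch of what "a set of rules for generating the
grammatical sentences" can look like, the toy phrase-structure grammar
below enumerates every sentence its rules derive. The rule set, the
vocabulary, and the function name `generate` are illustrative
assumptions, not a model taken from the literature surveyed here.

```python
from itertools import product

# Toy phrase-structure rules (hypothetical): each symbol maps to a list
# of alternative expansions; symbols absent from RULES are plain words.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["dog"], ["door"]],
    "V": [["opens"], ["sleeps"]],
}

def generate(symbol="S"):
    """Return every word sequence derivable from `symbol`."""
    if symbol not in RULES:          # terminal word
        return [[symbol]]
    results = []
    for expansion in RULES[symbol]:
        # Expand each part, then combine the alternatives in order.
        parts = [generate(s) for s in expansion]
        for combo in product(*parts):
            results.append([w for part in combo for w in part])
    return results

sentences = [" ".join(words) for words in generate()]
```

The grammar also derives "the door sleeps," which is grammatical yet
hardly meaningful; rules of this kind say nothing about sense.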

Although syntactics purports to ignore meanings, the
boundary line between grammar and semantics is quite hazy. For
example, some linguists classify the so-called "mass nouns" (e.g.
"water") as a separate grammatical group since they don't take the
article; however the distinction between "I want steak" and "I want
a steak" is generally considered to be a semantic one.

Ziff defines meaningfulness in terms of rigidity of grammatical
structure. Words which are necessary in a particular grammatical
configuration, such as frequent occurrences of "to," "do," "the," etc.,
are said to have no meaning. On the other hand words which could
be replaced by a large number of alternatives within a given
grammatical context are considered very meaningful. Simmons makes
this distinction between function words and content words even more
sharp, as we shall see later. We shall accept these notions only to
the extent that they do indicate a close connection between grammatical
and semantic classifications.

C. Existing Computer Programs

Several computer programs have been written in the past few
years which are relevant to the present discussion. Some of the
features of these are summarized below.



1) Feldman's analysis for simple English. (8) This program,
designed to translate simple English commands into tooling
machine instructions, performs the necessary syntactic analysis
over a certain class of simple sentences which are limited to a small,
prescribed vocabulary. Vocabulary items are identified with part-
of-speech, and certain are marked as "key" words.

2) Phillips' "Question-Answering Routine." Again a dictionary
of all needed words and their parts of speech is provided.
Sentences are analyzed on the basis of a phrase-structure grammar,
and placed in a canonical tabular form. The question is assumed
to involve only certain factors, and is transformed to be matched
against the sentences in the text, until the appropriate information
is found.

3) Simmons' grammatical coding. By making proper note of
"function" words (prepositions, auxiliary verbs, conjunctions, etc.)
which are keys to grammatical structure, suffixes, and empirical
rules of grammatical context (allowable pairs and triplets of
syntactic forms), this program can scan arbitrary text at a high
rate of speed and tag each word with its appropriate part of speech,
with a high degree of accuracy.

4) SYNTHEX. This system, designed to "synthesize human
language behavior," starts by scanning arbitrary text, performing
the grammatical coding described in 3) above and indexing all
occurrences of content (nonfunction) words. When a question is
read, its content words are identified and relevant sections of text
extracted by use of the index. An answer is then composed based on
a matching process similar to Phillips', although probably
somewhat more elaborate.

5) "Baseball." This program, written in the IPL-V programming
language, is designed to answer any reasonable verbal English question
about the results of a set of baseball games. The data is a certain
tree structure containing all the information about all games.
Questions are analyzed syntactically with the aid of a dictionary,
and the resulting forms completed by references to the data structure.
Tabulated answer information is then printed out. The dictionary
has a set of values of certain attributes for each word, such as
part of speech, whether part of an idiom, and "meaning." "Meaning,"
which only appears for certain words, is canonical translation
within the context of the program, e.g. the meaning of "who" is
"team = ?". This procedure is adequate for the problem, but would be
difficult to generalize for application to wider contexts.

6) Sable's semantic structure. Sable constructs a set-
inclusion tree for storing items in a limited technical vocabulary.
The information retrieval problem is simplified by using a simple
scheme for coding the location of items on the tree by level and
node count, and information about related items can be obtained
from their tree locations. Our proposal below will involve a
general application of similar semantic tree-structure principles.
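
The content-word indexing described under item 4) can be sketched
roughly as follows. This is only an illustration under stated
assumptions: the function-word list, the sample sentences, and the
names `content_words`, `build_index`, and `relevant` are hypothetical
and are not taken from any of the programs surveyed.

```python
from collections import defaultdict

# Hypothetical function-word list; a real system would use a fuller one.
FUNCTION_WORDS = {"the", "a", "is", "to", "do", "of", "in", "on", "did"}

def content_words(sentence):
    """Lower-case the sentence, strip punctuation, drop function words."""
    words = [w.strip(".,?!").lower() for w in sentence.split()]
    return [w for w in words if w and w not in FUNCTION_WORDS]

def build_index(sentences):
    """Map each content word to the set of sentence numbers containing it."""
    index = defaultdict(set)
    for i, sentence in enumerate(sentences):
        for word in content_words(sentence):
            index[word].add(i)
    return index

def relevant(question, sentences, index):
    """Return the sentences sharing the most content words with the question."""
    scores = defaultdict(int)
    for word in content_words(question):
        for i in index.get(word, ()):
            scores[i] += 1
    if not scores:
        return []
    best = max(scores.values())
    return [sentences[i] for i in sorted(scores) if scores[i] == best]

text = [
    "The dog chased the cat.",
    "The cat sat on the mat.",
    "The boy opened the door.",
]
index = build_index(text)
hits = relevant("Who opened the door?", text, index)
```

Matching here is by exact word form only, so "sit" in a question would
not find "sat" in the text; no grammatical or semantic analysis is
attempted.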

It is interesting to note that all the above programs are
based on algorithmic syntactic schemes. While it is presently
generally recognized that use of semantic notions will be essential
for difficult language-processing problems, useful specific
programmable schemes based on semantic ideas have thus far been too
elusive. The current state of this phase of the problem is discussed
in the following section.

D. Applications of Semantics

The necessity for having semantic information available has
become most apparent in recent work on mechanical translation of
natural language. Dictionary look-up and syntactic context schemes
have failed miserably, and the only help a bilingual human can give
when asked his translation method is to assert that he reads in one
language, "understands" what was read, and then re-expresses the same
"idea" in the new language. It is apparent that the same processes
are involved in human reading comprehension and question-answering
procedures and are therefore probably the key to efficient
language processing by machine.

In order to understand "understanding," we should look at what
is involved in learning new words. Quine describes three basic
approaches: 1) pointing in isolation ("stimulus meaning"); 2) from
context (performing the necessary inductive inference); 3) by
description (definition, in terms of other words). I feel that
while the first is the most fundamental, the third approach is by
far the most important in building a vocabulary and recalling
"meanings." Ziff (3) defines semantic relations as "... correlations
between types of events, or a type of event and a state of affairs, or
a type of word and a type of thing, or a word and a thing, and so
forth." However since types and states are generally associated with
specific words, all of the above except the last may be considered as
in a language system, they enter into all kinds of groupings held
together by a complex, unstable and highly subjective network of
associations: associations between the names and the senses,
associations based on similarity or some other relation. It is by
their effects that these associative connections make themselves
felt.... The sum total of these associative networks is the
vocabulary."

One way to deal with the problem of semantics would be to
avoid it by translating ordinary language into a formal system which
could be handled syntactically. Unfortunately this procedure, if
feasible at all, would obscure the real problems in a mass of detailed
documentation and notation, and be of little general value. At first
view Freudenthal's LINCOS may seem like a formal system for
describing human behavior, but it is actually quite far from such,
and assumes far greater abilities for inductive inference of rules
and situations on the part of the receiver than is expected of the
usual language student.

Quillian is attempting to represent the semantic content of
words as sets of "concepts," which could be combined to represent the
meanings of phrases and sentences. With the basic premise that
learning a new word involves measuring its values on a set of basic
scales, he is trying to build up a repertoire of suitable coordinate
scales (which are generally intuitive, unidimensional coordinates;
e.g. length, time, hue, etc.), and code the corresponding representations
of English words. He also permits defining words in terms of
predefined words as coordinates. Some of this work is being programmed
for the computer in the COMIT system. My feeling is that the
relations between words are more important than the conceptual meaning
of individual words in terms of something more basic (assuming such
can be suitably found), and therefore a simpler approach which ignores
base meanings might be more immediately fruitful.

Sommers is more specifically concerned with permissible word
combinations. He first describes a hierarchy of sentence types:
1) Ungrammatical; 2) Grammatical but nonsense; 3) Sensible but false;
4) True. He then argues that the crucial semantic distinction lies
between the grammatical declarative sentences which are nonsense,
and those which are significant (but may be true or false). Any pair
of monadic predicates P1, P2 are said to have a sense value U(P1, P2)
if there exists any significant sentence conjoining them. Otherwise
they have value N(P1, P2). The relation is symmetric and is
preserved under the usual logical operations on its arguments, but is
not transitive. A stronger sense relation Q ← P is true if "of (what
is) P, it can be significantly said that it is Q..." E.g. P = Prime
Minister, Q = quick. This permits the arrangement of these
"monadic predicates" into a simple tree, where all words in the
same meaning class (e.g. all colors, or all words describing weight)
occupy the same node. My main objection to this work is in where
the important distinctions lie. Sommers would argue that "the idea
is always green" is nonsense, but "the sky is always green" is
sensible (since sky may have color, for "the sky is blue" and therefore
"the sky is not blue" are significant), although false. Note that
"ideas cannot be green" would be considered nonsense rather than
true by Sommers. I feel the distinction between "nonsense" and
"sensible but not true of the real world" to be too hazy to be the
basis for a semantic system (such as that based on the U and ← relations).
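
The properties claimed for the U relation can be illustrated with a
small sketch. It assumes, purely for illustration, that each predicate
is represented by the set of kinds of things of which it can be
significantly (truly or falsely) said; this representation, the table
`SPANS`, and its entries are hypothetical and are not Sommers' own
formulation.

```python
# Hypothetical data: for each predicate, the kinds of things of which it
# can be significantly (truly or falsely) said.
SPANS = {
    "green": {"sky", "fence"},
    "tall": {"fence", "person"},
    "thoughtful": {"person"},
}

def U(p1, p2):
    """True if some kind of thing can significantly take both predicates."""
    return bool(SPANS[p1] & SPANS[p2])

# U does not depend on the order of its arguments.
symmetric = all(U(a, b) == U(b, a) for a in SPANS for b in SPANS)

# "green"/"tall" and "tall"/"thoughtful" can each be conjoined, yet
# "green"/"thoughtful" cannot: U is not transitive.
non_transitive = (
    U("green", "tall")
    and U("tall", "thoughtful")
    and not U("green", "thoughtful")
)
```

Under this reading the value N(P1, P2) is simply the case where the two
sets do not overlap.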

A new book by Nida discusses several types of possible word
associations. Chain associations, i.e. linear ordering of terms
such as numerals or colors of the spectrum, have very limited
applicability. Hierarchical ordering according to generality
(class inclusion) is essentially the scheme which we shall adopt,
with modifications. I don't believe as Nida does that pronouns have
a place in the hierarchy as very general terms. Rather, they should
be replaced by their antecedents. Constituent analysis is a scheme
similar to Quillian's, and again requires the difficult choice of
coordinates and assignment of values. It is difficult to distinguish
between Nida's "four basic types of semes as fundamental components
in semantic structure" and certain well-known grammatical parts of
speech, except that the rules for semantic parsing of sentences are
much more obscure.

III. Proposal

Let us now review the major points established in the preceding
paragraphs:

1) Schemes based on the syntactic analysis of English language have
been successfully programmed to solve certain limited linguistic
problems.

2) Such schemes are not adequate, as given, for any larger,
more general problems.

3) However, syntactic analysis (grammar) must be an important
part of any more general method.

4) Understanding seems to be based largely on various
associations between words.

5) None of the semantic analysis schemes proposed thus far
can be realizable on a computer in the near future, largely because
they involve procedures which are too vague and general, in connection
with problems which are too large.

My feelings with regard to point 5) above are based on the
following heuristic: "Large problems are easier to solve if the
solutions to smaller, similar ones are available." Here "similar"
is used in its broadest, most ambiguous sense. If we are lucky the
small problem will turn out to be a subproblem of the large one, or
a special case with obvious routes to generalization (although this
might not be apparent at the outset); however the solution will
almost always suggest methods which should be tried on the large problem
(or, just as important, methods which should not be tried).

I propose to develop a computer program which will have the
ability to utilize semantic "background" information while answering
questions based on available text material. Texts will be limited
to a certain small vocabulary and simple sentence structures, probably
selected from children's literature. Grammatical information such
as phrase structure rules and the part-of-speech labeling available
from Simmons' program (10) will be available to the system. The
semantic information will be implicit in the organization of the
"dictionary," which will contain divisions and associations of both
syntactic and semantic natures, as described below.

Words will first be classified by part of speech. Nouns will
then be arranged in a hierarchical structure based on generality
(class inclusions; this will probably involve re-entrant tree structure).
Other associations will also be indicated between words in different
parts of the tree, such as part-whole relationships, or just
abstractly "related" (i.e. likely to appear in the same sentence,
e.g. "dog" and "bow-wow"). To some extent a verb tree can also be
set up, with [illegible] branching into "walk," [illegible], and
"spin," for example.
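
A minimal sketch of such a noun structure follows, assuming a plain
table that lists the more general classes of each noun. The words, the
tables `PARENTS`, `PART_OF`, and `RELATED`, and the function names are
all hypothetical; nothing here is the proposed program itself.

```python
# Hypothetical noun hierarchy: each noun lists its more general classes.
# A noun with two parents makes the structure re-entrant (a directed
# acyclic graph rather than a strict tree).
PARENTS = {
    "thing": [],
    "animal": ["thing"],
    "pet": ["thing"],
    "dog": ["animal", "pet"],
    "tail": ["thing"],
}

# Other associations, recorded alongside the hierarchy (hypothetical).
PART_OF = {"tail": "dog"}        # part-whole relationship
RELATED = {("dog", "bark")}      # loosely "related" words

def ancestors(noun):
    """Return every class that includes `noun`, following all parents."""
    seen = set()
    stack = list(PARENTS.get(noun, []))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(PARENTS.get(cls, []))
    return seen

def is_a(noun, cls):
    """True if `noun` is `cls` or falls under it by class inclusion."""
    return noun == cls or cls in ancestors(noun)
```

For example, `is_a("dog", "pet")` and `is_a("dog", "animal")` both
hold, which is what the re-entrant structure is meant to allow.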

Connections can then be made between nodes in the noun tree
and nodes in the verb tree, indicating, say, "it makes sense to use
any noun below this point as the subject (or object) of any verb
below that corresponding point." Similar structures could be set up
within the classes of adjectives and adverbs, and between them and the
nouns and verbs which they may modify.
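
One way such noun-verb connections might be checked is sketched below.
It is a rough illustration only: the tree fragments, the root "move",
the link table, and the names `under` and `makes_sense` are assumptions
introduced for the example.

```python
# Hypothetical fragments of a noun tree and a verb tree (child -> parent).
NOUN_PARENT = {"animal": "thing", "dog": "animal", "door": "thing"}
VERB_PARENT = {"walk": "move", "spin": "move"}

# Links between the trees: any noun at or below the first node may serve
# as the subject of any verb at or below the second node.
SUBJECT_LINKS = [("animal", "walk"), ("thing", "spin")]

def under(word, node, parent):
    """True if `word` equals `node` or lies below it in the tree."""
    while word is not None:
        if word == node:
            return True
        word = parent.get(word)
    return False

def makes_sense(noun, verb):
    """True if some link covers both the noun and the verb."""
    return any(
        under(noun, n, NOUN_PARENT) and under(verb, v, VERB_PARENT)
        for n, v in SUBJECT_LINKS
    )
```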

This semantic dictionary will be prepared in advance and
available to the initial state of the program. It could conceivably
be used then to generate random English sentences, all of which may
be true of the real world. However, the question-answering procedure
would be as follows:

1) As the text is read, a "thread" is inserted into the
dictionary which follows the actual events of the text.

2) If "impossible" word relations or new vocabulary appear,
the program will complain.

3) When a question is received, the words in the question, the
text thread, and the structure of the dictionary are all available
to be used while composing an answer.
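
The three steps above can be sketched in miniature as follows. This is
an illustration under strong assumptions: sentences are restricted to
the form "The <noun> <verb>.", the word lists and the `SENSIBLE` table
are hypothetical, and the thread is kept as a separate list rather than
being inserted into the dictionary structure itself.

```python
# Hypothetical dictionary: known words and the subject-verb pairs that
# make sense. In the proposal this would come from the linked trees.
NOUNS = {"dog", "boy", "door"}
VERBS = {"ran", "slept", "opened"}
SENSIBLE = {("dog", "ran"), ("dog", "slept"), ("boy", "ran"),
            ("boy", "slept"), ("door", "opened")}

def read_text(sentences):
    """Build the thread of events; collect complaints instead of failing."""
    thread, complaints = [], []
    for sentence in sentences:
        words = sentence.lower().strip(".").split()
        if len(words) != 3 or words[0] != "the":
            complaints.append("cannot parse: " + sentence)
            continue
        _, noun, verb = words
        if noun not in NOUNS or verb not in VERBS:
            complaints.append("new vocabulary: " + sentence)
        elif (noun, verb) not in SENSIBLE:
            complaints.append("impossible relation: " + sentence)
        else:
            thread.append((noun, verb))   # events kept in text order
    return thread, complaints

def who(verb, thread):
    """Answer 'Who <verb>?' by scanning the thread in order."""
    return [noun for noun, v in thread if v == verb]

thread, complaints = read_text(
    ["The dog ran.", "The door slept.", "The boy ran.", "The cat ran."]
)
answer = who("ran", thread)
```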

Ambiguous words, oblique meanings, and abstract concepts will
be avoided or ignored. The purpose of this study will be

1) To determine whether it is possible to store a significant
amount of semantic information, by means of the type of structure
described, in a reasonable amount of space.

2) To determine the nature of the search and analysis procedures
which are needed to utilize most efficiently this available
semantic information.

I shall soon start preparing detailed flow-charts for this
program. I would appreciate hearing whatever ideas or suggestions
anyone might have.